Generalisations over Corpus-induced Frame Assignment Rules
Abstract
In this paper we discuss motivations and strategies for generalising over the instance-based frame assignment rules that we extract from frame-annotated corpora. Corpus-induced syntax-semantics mapping rules for frame assignment can be used for automatic semantic role labelling of unparsed text and, beyond that, to extract linguistic knowledge for a lexical semantic resource with a general syntax-semantics interface. We provide a data analysis of a comprehensive set of corpus-induced frame assignment rules and discuss the potential of applying different types of generalisations and filters to obtain a uniform, extended data set for the extraction of linguistic knowledge.
Similar resources
Corpus-Based Induction Of An LFG Syntax-Semantics Interface For Frame Semantic Processing
We present a method for corpus-based induction of an LFG syntax-semantics interface for frame semantic processing in a computational LFG parsing architecture. We show how to model frame semantic annotations in an LFG projection architecture, including special phenomena that involve non-isomorphic mappings between levels. Frame semantic annotations are ported from a manually annotated corpus to ...
Corpus-based Induction of a Frame Semantics Projection for LFG
In computational linguistics there is growing insight that high-quality NLP applications for information access (question answering, etc.) are in need of deeper linguistic analysis, in particular, semantic analysis. A bottleneck for semantic processing is the lack of large-scale domain-independent lexical semantic resources. While WordNets for several languages are important lexical resources fo...
Implementation and Evaluation of PAROLE PoS in a National Context
We are annotating the complete 20 million Dutch PAROLE corpus with PoS and lemma. The morphosyntactic tagging of 250,000 words during the PAROLE project was the first confrontation of the fine-grained Dutch PAROLE tagset and its ’functional’ mode of application, with real corpus data. The correction of the manual tagging and the compilation of a 100,000 words training corpus for the automatic t...
A semantic tagging tool for spoken dialogue corpus
In this paper, we report on our semantic tagging tool for spoken dialogue corpora. The tool can acquire analysis rules from a small-scale training corpus using Transformation-based Learning (TBL). It can learn dialogue act tagging rules and semantic frame tagging rules. In an open test, precision is 72% for dialogue act tagging and 58% for semantic frame tagging.
A language-independent probabilistic model for automatic conversion between graphemic and phonemic transcription of words
In this paper we present a novel language-independent probabilistic model for automatic grapheme-to-phoneme and phoneme-to-grapheme conversion of words. In a fully unsupervised training procedure, two processes are applied; the transformation rules, which usually fail to provide the correct symbols, are eliminated, and new variable-length string transformation rules are defined improving the st...